
United States Patent and Trademark Office 



UNITED STATES DEPARTMENT OF COMMERCE 
United States Patent and Trademark Office 
Address: COMMISSIONER FOR PATENTS 
P.O. Box 1450

Alexandria, Virginia 22313-1450 
www.uspto.gov 



APPLICATION NO. 



FILING DATE 



FIRST NAMED INVENTOR 



ATTORNEY DOCKET NO. 



CONFIRMATION NO. 



09/921,766 



08/03/2001 



Philippe R. Morin 



27572 7590 03/24/2004 

HARNESS, DICKEY & PIERCE, P.L.C. 
P.O. BOX 828 

BLOOMFIELD HILLS, MI 48303 



9432-000141 



8751 



EXAMINER 



LERNER, MARTIN 



ART UNIT 



PAPER NUMBER 



2654 

DATE MAILED: 03/24/2004 



Please find below and/or attached an Office communication concerning this application or proceeding. 



PTO-90C (Rev. 10/03) 




Office Action Summary 


Application No. 

09/921,766 


Applicant(s) 
MORIN ET AL 


Examiner 
Martin Lerner


Art Unit 

2654 





-- The MAILING DATE of this communication appears on the cover sheet with the correspondence address --
Period for Reply 



A SHORTENED STATUTORY PERIOD FOR REPLY IS SET TO EXPIRE 3 MONTH(S) FROM 
THE MAILING DATE OF THIS COMMUNICATION. 

- Extensions of time may be available under the provisions of 37 CFR 1.136(a). In no event, however, may a reply be timely filed
after SIX (6) MONTHS from the mailing date of this communication.

- If the period for reply specified above is less than thirty (30) days, a reply within the statutory minimum of thirty (30) days will be considered timely.

- If NO period for reply is specified above, the maximum statutory period will apply and will expire SIX (6) MONTHS from the mailing date of this communication.

- Failure to reply within the set or extended period for reply will, by statute, cause the application to become ABANDONED (35 U.S.C. § 133).
Any reply received by the Office later than three months after the mailing date of this communication, even if timely filed, may reduce any 
earned patent term adjustment. See 37 CFR 1.704(b). 

Status 

1) [X] Responsive to communication(s) filed on 30 January 2004.
2a) [X] This action is FINAL. 2b) [ ] This action is non-final.

3) [ ] Since this application is in condition for allowance except for formal matters, prosecution as to the merits is
closed in accordance with the practice under Ex parte Quayle, 1935 C.D. 11, 453 O.G. 213.

Disposition of Claims 

4) [X] Claim(s) 1, 2, 4 to 14, and 16 to 22 is/are pending in the application.

4a) Of the above claim(s) is/are withdrawn from consideration.

5) [ ] Claim(s) is/are allowed.

6) [X] Claim(s) 1, 2, 4 to 14, and 16 to 22 is/are rejected.

7) [ ] Claim(s) is/are objected to.

8) [ ] Claim(s) are subject to restriction and/or election requirement.

Application Papers 

9) [ ] The specification is objected to by the Examiner.

10) [X] The drawing(s) filed on 30 January 2004 is/are: a) [X] accepted or b) [ ] objected to by the Examiner.

Applicant may not request that any objection to the drawing(s) be held in abeyance. See 37 CFR 1.85(a).

Replacement drawing sheet(s) including the correction is required if the drawing(s) is objected to. See 37 CFR 1.121(d).
11) [ ] The oath or declaration is objected to by the Examiner. Note the attached Office Action or form PTO-152.

Priority under 35 U.S.C. § 119 

12) [ ] Acknowledgment is made of a claim for foreign priority under 35 U.S.C. § 119(a)-(d) or (f).
a) [ ] All b) [ ] Some * c) [ ] None of:

1. [ ] Certified copies of the priority documents have been received.

2. [ ] Certified copies of the priority documents have been received in Application No. .

3. [ ] Copies of the certified copies of the priority documents have been received in this National Stage
application from the International Bureau (PCT Rule 17.2(a)).
* See the attached detailed Office action for a list of the certified copies not received.



Attachment(s)

1) [X] Notice of References Cited (PTO-892)

2) [ ] Notice of Draftsperson's Patent Drawing Review (PTO-948)

3) [ ] Information Disclosure Statement(s) (PTO-1449 or PTO/SB/08)

Paper No(s)/Mail Date .

4) [ ] Interview Summary (PTO-413)

Paper No(s)/Mail Date .

5) [ ] Notice of Informal Patent Application (PTO-152)

6) [ ] Other .



U.S. Patent and Trademark Office 
PTOL-326 (Rev. 1-04) 



Office Action Summary 



Part of Paper No./Mail Date 6 



Application/Control Number: 09/921,766 Page 2

Art Unit: 2654 



DETAILED ACTION 

Claim Rejections - 35 USC §112 

1. The following is a quotation of the first paragraph of 35 U.S.C. 112:

The specification shall contain a written description of the invention, and of 
the manner and process of making and using it, in such full, clear, 
concise, and exact terms as to enable any person skilled in the art to 
which it pertains, or with which it is most nearly connected, to make and 
use the same and shall set forth the best mode contemplated by the 
inventor of carrying out his invention. 

2. Claims 1 to 22 are rejected under 35 U.S.C. 112, first paragraph, as failing 
to comply with the written description requirement. The claims contain subject
matter which was not described in the specification in such a way as to 
reasonably convey to one skilled in the relevant art that the inventors, at the time 
the application was filed, had possession of the claimed invention. 

Independent claims 1 and 10 contain new matter. The limitation "a 
sequence of the recognized values echoed in the audio feedback reflects a 
sequence of the spotted words within the input utterance" is not described in the 
specification in such a way as to reasonably convey that the inventors had 
possession of the claimed invention at the time the application was filed. There 
is no express disclosure in Applicants' Specification of the sequence of values 
reflecting a sequence within the input utterance. There is no language in the 
Specification as originally filed about "reflecting" or "sequence". Nor would one 
skilled in the art find any inherent disclosure in the Specification for the claimed
limitation. 



Claim Rejections - 35 USC § 102 

3. The following is a quotation of the appropriate paragraphs of 35 
U.S.C. 102 that form the basis for the rejections under this section made in this 
Office action: 

A person shall be entitled to a patent unless - 

(b) the invention was patented or described in a printed publication in this 
or a foreign country or in public use or on sale in this country, more than 
one year prior to the date of application for patent in the United States. 

4. Claims 1, 2, 6, 8 to 10, 13, 14, 17, and 19 to 20 are rejected under 35 
U.S.C. 102(b) as being anticipated by Takebayashi et al. 

Regarding independent claim 1, Takebayashi et al. discloses a method of 
data entry by voice, comprising: 

"communicating an input utterance from a speaker to a speech recognition 
means" - (column 8, lines 46 to 55: Figure 2); 

"spotting a plurality of spotted words of at least two recognized spoken 
words within the input utterance, wherein the spotted words form a phrase 
containing at least one of field-specific values and commands" - keyword 
detection unit 21 (column 8, line 55 to column 9, line 22: Figure 2); keywords are 
received in a word lattice or frame format ("field-specific values"), e.g. "three" 
"hamburgers" (column 10, lines 6 to 17: Figure 4); keywords include commands 
such as "order", "cancellation", and "replacement" commands (column 10, lines
18 to 24: Figure 5);

"echoing at least one of recognized values and commands back to the 
speaker via a text-to-speech system, wherein audio feedback echoing at least 
one of recognized values and recognized commands is performed upon 
interpretation of each input utterance, and a sequence of the recognized values 
echoed in the audio feedback reflects a sequence of the spotted words within the 
input utterance" - response generation unit 13 (column 7, lines 23 to 43: Figure 
2; column 17, lines 61 to 65); the multimodal response output generated such 
that the speech response for the confirmation message of "Your orders are one 
hamburger, two coffees, and four large colas, right?" is outputted from the 
loudspeaker unit 15 (column 13, lines 41 to 50: Figure 12C); an order contains 
both "values" and "commands", as the values are the numbers and types of each 
item ordered, and the order is a command to provide the items ordered; for a 
situation in which one hamburger and one cola has already been ordered, the 
confirmation is "Your orders are one hamburger and one cola, right?" (column 22, 
lines 18 to 25: Figure 30B); order confirmation messages are "audio feedback 
echoing" values and commands; for an order of one hamburger and one cola, the 
audio confirmation message "reflects a sequence of the spotted words in the 
input utterance" by identifying a quantity associated with each item ordered; 

"rejecting unreliable or unsafe input for which a confidence measure is 
found to be low" - (column 13, lines 6 to 10; column 20, lines 35 to 58; column 
24, lines 18 to 44; column 25, lines 24 to 37);

"maintaining a dialogue history enabling editing operations and correction
operations on all active fields" - (column 6, lines 50 to 57); editing operations and 
correction operations include "addition", "cancellation", and "replacement" 
(column 10, lines 18 to 24: Figure 5). 



Regarding independent claim 10, Takebayashi et al. discloses an article of
manufacture for data entry by voice, comprising: 

"an operating system" - processing unit 291 contains an operating system 
(column 29, lines 49 to 56: Figure 45); 

"a memory in communication with said operating system" - memory 292 
(column 29, lines 29 to 56: Figure 45); 

"a speech recognition means in communication with said operating 
system" - speech understanding unit 11 (column 6, lines 44 to 50: Figure 1); 

"a speech generation means in communication with said operating 
system" - response generation unit 13 (column 7, lines 23 to 43: Figure 1); 

"a dialogue history maintenance means in communication with said 
operating system" - (column 6, lines 50 to 57); 

"wherein said operating system manages said memory, said speech 
recognition means, said speech generation means, and said dialogue history 
maintenance means in a manner permitting the user to monitor speech 
recognition of an input utterance by means of a generated speech corresponding 
to at least one of field-specific values and commands contained within the phrase 
formed by spotted words within the input utterance, and to perform editing 
operations and correction operations on all active fields, wherein audio feedback
echoing at least one of recognized values and recognized commands is 
performed upon interpretation of each input utterance, and a sequence of the 
recognized values echoed in the audio feedback reflects a sequence of the 
spotted words within the input utterance" - keyword detection unit 21 (column 8, 
line 55 to column 9, line 22: Figure 2); keywords are received in a word lattice or 
frame format ("field-specific values"), e.g. "three" "hamburgers" (column 10, lines 
6 to 17: Figure 4); keywords include commands such as "order", "cancellation", 
and "replacement" commands (column 10, lines 18 to 24: Figure 5; column 6, 
lines 50 to 57); the multimodal response output generated such that the speech 
response for the confirmation message of "Your orders are one hamburger, two 
coffees, and four large colas, right?" is outputted from the loudspeaker unit 15 
(column 13, lines 41 to 50: Figure 12C); an order contains both "values" and 
"commands", as the values are the numbers and types of each item ordered, and 
the order is a command to provide the items ordered; for a situation in which one 
hamburger and one cola has already been ordered, the confirmation is "Your 
orders are one hamburger and one cola, right?" (column 22, lines 18 to 25: 
Figure 30B); order confirmation messages are "audio feedback echoing" values 
and commands; for an order of one hamburger and one cola, the audio 
confirmation message "reflects a sequence of the spotted words in the input 
utterance" by identifying a quantity with each item ordered.

Regarding claims 2 and 14, syntactic and semantic analysis unit 21
determines keywords by semantics (column 6, lines 44 to 50; column 9, lines 38 
to 50). 

Regarding claims 6, 9, 17, and 20, correction commands include 
"cancellation" commands for deletion of a last entry, e.g. "That's Wrong" and 
"Cancel" (column 10, lines 18 to 24: Figure 5) and deletion confirmation (Figure 
15B). 

Regarding claims 8 and 19, editing operations include "replacement" 
commands "Rather" and "Instead" (column 10, lines 18 to 24: Figure 5) and 
replacement confirmation (Figure 15B). 

Regarding claim 13, response generation unit 13 generates the speech 
response in a synthesized voice (column 7, lines 23 to 43: Figure 1). 



Claim Rejections - 35 USC § 103 

5. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for 

all obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically 
disclosed or described as set forth in section 102 of this title, if the 
differences between the subject matter sought to be patented and the 
prior art are such that the subject matter as a whole would have been 
obvious at the time the invention was made to a person having ordinary 
skill in the art to which said subject matter pertains. Patentability shall not 
be negatived by the manner in which the invention was made. 

6. Claims 4, 5, 11, 12, and 16 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Takebayashi et al. in view of LaRue.

Concerning claims 4 and 16, Takebayashi et al. omits automatic
adaptation after a form is filled in and sent for search in a database. However, it 
is generally well known to provide adaptation to a user's voice for a voice 
recognition system during downtime once a processing session is completed. 
LaRue teaches automatic adaptation of a word recognition procedure to 
individual users. (Column 3, Lines 39 to 42; Column 10, Lines 64 to 67; Column 
13, Lines 28 to 30) It would have been obvious to one having ordinary skill in the 
art to perform automatic adaptation as suggested by LaRue after conclusion of 
an ordering session in Takebayashi et al. for the purpose of adapting a voice of 
an individual user when the processor is not active. 

Concerning claims 5, 11, and 12, Takebayashi et al. omits a backup input 
system as a keyboard or touch screen. However, LaRue teaches a speech 
recognition system including a keyboard and an input panel 36 to enhance the 
ability to communicate audibly in a man-machine interaction. (Column 1, Lines 
19 to 27; Column 4, Lines 36 to 39; Column 13, Lines 62 to 63: Figure 2)
Including an additional input device in a speech recognition system is generally 
well known for the purpose of providing flexibility by permitting a plurality of 
modes of input or when one input device fails to operate. It would have been 
obvious to one having ordinary skill in the art to include a backup input system as 
a keyboard or input panel as taught by LaRue in the human-computer interaction 
system of Takebayashi et al. to improve and enhance the flexibility of a
man-machine interaction for a speech recognition system.

7. Claims 7 and 18 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Takebayashi et al. in view of Cornelison. 

Takebayashi et al. omits letters and numbers for a license plate as field- 
specific values. However, Cornelison teaches a parking ticket enforcement 
system allowing for the search of license plates by key words of letters and 
numbers through voice input from a police officer. (Column 7, Line 11 to Column
8, Line 39) This is desirable to provide a police officer on duty the capability of
conveniently and effectively determining whether or not an observed vehicle has 
been associated with criminal activity. (Column 1, Lines 39 to 48) It would have 
been obvious to one having ordinary skill in the art to apply the word lattice and 
frame format in the voice data entry of Takebayashi et al. to recognize letters and
numbers of a license plate as taught by Cornelison for the purpose of providing a 
police officer on duty the capability of conveniently and effectively determining 
whether or not an observed vehicle has been associated with criminal activity. 



8. Claims 21 and 22 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Takebayashi et al. in view of Richards. 

Takebayashi et al. omits full duplex dialogue interaction with speech 
recognition and auditory feedback. However, full duplex interaction is well known 
for interactive voice response (IVR) systems, generally. Particularly, Richards 
teaches a sound card for analogous art game software, where the sound engine 
is capable of running in a full duplex mode to generate sound while concurrently 
receiving spoken utterances. (Column 6, Lines 39 to 56: Figure 1B) It is 
suggested that full duplex capability provides greater flexibility for interactive
voice response (IVR) systems so that a user need not wait for the system to 
cease generating sound before the user begins to talk. It would have been 
obvious to one having ordinary skill in the art to utilize full duplex dialogue 
interaction with speech recognition and auditory feedback as suggested by 
Richards in the speech dialogue system of Takebayashi et al. for the known
purpose of providing greater flexibility for interactive voice response (IVR) 
systems. 



Response to Arguments 

9. Applicants' arguments filed 30 January 2004 have been fully considered 
but they are not persuasive. 

Firstly, Applicants maintain support for the new limitations as originally 
filed with claims 3 and 15 is found in paragraphs [0005] and [0024] - [0025] of 
the Specification. This position is traversed. 

The Specification was fully reviewed, but support for the new limitations is 
not found therein. Original claims 3 and 15 only state audio feedback is 
performed. These claims do not say anything about "a sequence of the 
recognized values echoed in the audio feedback reflects a sequence of the 
spotted words within the input utterance". Nor do the passages cited by 
Applicants expressly disclose a sequence of recognized values reflecting a 
sequence of words in the input utterance. Paragraph [0024] only discloses "a 
tightly coupled dialogue". Paragraph [0025] only states the entry is echoed as 
output speech, but neither expressly nor inherently suggests anything about the
audio feedback reflecting the sequence of recognized values. Thus, these 
limitations are new matter. 

Secondly, Applicants maintain Takebayashi et al. does not teach 
performance of audio feedback echoing recognized values or recognized 
commands upon interpretation of each input utterance, such that the sequence of 
the recognized values echoed in the audio feedback reflects a sequence of the 
spotted words within the input utterance. Applicants contend Takebayashi et al. 
only discloses an "appropriate response output", such as "Would you like fries 
with that?" Applicants admit that, in some instances, Takebayashi et al. 
discloses a confirmation of contents of the order, such as a "hamburger" and a 
"cola", but say it is not clear whether the user stated "hamburger" and then "cola", 
or stated "cola" and then "hamburger". Thus, Applicants argue there is no 
evidence to support a preservation of sequence of the entry data in Takebayashi 
et al. Applicants also cite the ordering process of adding two "hamburger" and 
two "coffees" to the form, noting Figure 30B, where "coffees" are arbitrarily added 
in between previously ordered items. Applicants admit there may be occasions 
where Takebayashi et al. does manage to echo the values in the order they were 
received, but say this occurrence is coincidental at best. In contrast, Applicants
say the sequence of entry is preserved for license plate form fields in their 
invention. These arguments are traversed. 

Takebayashi et al. anticipates the claimed invention of audio feedback 
echoing recognized values and recognized commands in a manner so as to 
reflect the sequence of words in the input utterance. Basically, Applicants are
selectively citing irrelevant passages from Takebayashi et al., although
Applicants do fairly admit there may be occasions where Takebayashi et al. does
manage to echo the values in the order they were received. Significantly, it is not 
necessary for Takebayashi et al. to echo back every item in the order in the exact 
manner it was spoken to anticipate the claim language of echoing to reflect the 
sequence of words in the input utterance. For example, if at any point the user 
says "three coffees", associating "three" and "coffees" in the audio feedback 
meets the claim limitation, even though the order may also include "two 
hamburgers" and "one cheese burger". Any association of a quantity with an 
item in an original order "reflects the sequence" of words in the input utterance. 

Moreover, Takebayashi et al. anticipates the claimed invention for any 
original order. Applicants wish to draw attention away to the exceptional situation 
where the user changes his/her order by adding or deleting items to show the 
association of items and quantities of the input utterance is not preserved in 
Takebayashi et al. However, the main embodiment of Takebayashi et al. is 
relatively simple: The user places an order and the system echoes back the 
order. At Column 13, Lines 36 to 50, the user orders one hamburger, two 
coffees, and four large colas; the system echoes back an audio response, saying 
"Your orders are one hamburger, two coffees, and four colas, right?" See 
Figures 12A to 12C. Each original order preserves the relationship between the 
quantity of the item and the item ordered in the echoed audio speech response. 
Similarly, Column 22, Lines 4 to 25, says a user's original order is one 
hamburger and one cola; the system echoes back a response, saying "Your
orders are one hamburger and one cola, right?" See Figures 30A and 30B.
Finally, Figures 16, 18, 21, and 22A to 22D of Takebayashi et al. provide for
partial confirmation of orders and for one by one confirmation. One by one 
confirmation provides confirmation of each item ordered, so that when an order 
includes two hamburgers, the system confirms, "Let me confirm one by one. You 
want two hamburgers, right?" One by one confirmation preserves the sequence 
of the quantity "two" and the item "hamburgers". Thus, Takebayashi et al.
discloses a number of embodiments where a sequence of recognized values 
within the input utterance, i.e. the quantities and items, is reflected in the echoed 
audio feedback. 

Therefore, the rejections of claims 1 to 22 under 35 U.S.C. 112, 1st ¶; of
claims 1, 2, 6, 8 to 10, 13, 14, 17, and 19 to 20 under 35 U.S.C. 102(b) as being
anticipated by Takebayashi et al.; of claims 4, 5, 11, 12, and 16 under 35 U.S.C.
103(a) as being unpatentable over Takebayashi et al. in view of LaRue; of claims
7 and 18 under 35 U.S.C. 103(a) as being unpatentable over Takebayashi et al.
in view of Cornelison; and of claims 21 and 22 under 35 U.S.C. 103(a) as being
unpatentable over Takebayashi et al. in view of Richards, are proper.

Conclusion 

10. The prior art made of record and not relied upon is considered pertinent to 
Applicants' disclosure.

Mostow et al. also discloses full duplex for interactive voice response
(IVR) systems. (Column 8, Lines 6 to 13) 

11. Applicants' amendment necessitated the new ground(s) of rejection
presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. 
See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as 
set forth in 37 CFR 1.136(a). 

A shortened statutory period for reply to this final action is set to expire 
THREE MONTHS from the mailing date of this action. In the event a first reply is 
filed within TWO MONTHS of the mailing date of this final action and the advisory 
action is not mailed until after the end of the THREE-MONTH shortened statutory 
period, then the shortened statutory period will expire on the date the advisory 
action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be 
calculated from the mailing date of the advisory action. In no event, however, will 
the statutory period for reply expire later than SIX MONTHS from the date of this 
final action. 

Any inquiry concerning this communication or earlier communications from 
the examiner should be directed to Martin Lerner whose telephone number is 
(703) 308-9064. The examiner can normally be reached on 8:30 AM to 6:00 PM 
Monday to Thursday. 

If attempts to reach the examiner by telephone are unsuccessful, the 
examiner's supervisor, Richemond Dorvil, can be reached on (703) 305-9645.
The fax phone number for the organization where this application or proceeding 
is assigned is (703) 872-9306.

Any inquiry of a general nature or relating to the status of this application
or proceeding should be directed to the receptionist whose telephone number is 
(703) 305-4700. 